|
The Google Ngram Viewer, or Google Books Ngram Viewer, is an online viewer, initially based on Google Books, that charts the frequency of any word or short phrase using yearly counts of ''n''-grams found in sources printed between 1500 and 2015 in American English, British English, French, German, Spanish, Russian, Hebrew, and Chinese. Lacking an independent corpus, Italian words are counted by their use in other languages. The viewer can search for a single word (including a misspelling), a phrase, or gibberish. The ''n''-grams are matched by case-sensitive spelling, so uppercase and lowercase letters must match exactly, and are plotted on the graph if they are found in 40 or more books. The viewer now also supports searches for parts of speech and wildcards.

The viewer was developed by Jon Orwant and Will Brockman and released in mid-December 2010. It was inspired by a prototype called "Bookworm", created by Jean-Baptiste Michel and Erez Aiden from Harvard's Cultural Observatory, Yuan Shen from MIT, and Steven Pinker.〔https://www.youtube.com/watch?v=5S1d3cNge24&feature=youtu.be&t=56m58s〕

Researchers have analyzed the Google Ngram database of books written in American or British English. Research based on the database has found correlations between emotional output and significant 20th-century events such as World War II, and has been used to check and challenge popular claims about trends such as the secularisation or economisation of modern societies.〔Roth, S. (2014), "Fashionable functions. A Google ngram view of trends in functional differentiation (1800-2000)", ''International Journal of Technology and Human Interaction'', vol. 10, no. 2, pp. 34-58 (online: http://ssrn.com/abstract=2491422).〕 In this sense, the viewer is a valuable research tool for the digital humanities.

==Corpora==
The corpora used for the search consist of total_counts, 1-gram, 2-gram, 3-gram, 4-gram, and 5-gram files for each language. Each file contains tab-separated data.
Each line has the following format:〔Google〕
*total_counts file
::year TAB match_count TAB page_count TAB volume_count NEWLINE
*Version 1 ngram file (generated in July 2009)
::ngram TAB year TAB match_count TAB page_count TAB volume_count NEWLINE
*Version 2 ngram file (generated in July 2012)
::ngram TAB year TAB match_count TAB volume_count NEWLINE

The Google Ngram Viewer uses match_count to plot the graph. As an example, the word "Wikipedia" is stored in the Version 2 file of the English 1-grams,〔googlebooks-eng-all-1gram-20120701-w.gz at http://storage.googleapis.com/books/ngrams/books/datasetsv2.html〕 and the graph that the Google Ngram Viewer plots for the word is drawn from this data.
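
The line formats described in this section can be read with a few lines of code. The following is a minimal sketch in Python, not the viewer's own implementation: the names <code>NgramRecord</code>, <code>parse_ngram_line</code>, and <code>match_counts_by_year</code> are hypothetical, and the sample lines are made-up values in the Version 2 layout rather than data from the real corpus. It assumes that a Version 1 line can be told apart from a Version 2 line by its number of tab-separated fields (five versus four).

```python
from typing import NamedTuple, Optional


class NgramRecord(NamedTuple):
    ngram: str
    year: int
    match_count: int
    page_count: Optional[int]  # present only in Version 1 files
    volume_count: int


def parse_ngram_line(line: str) -> NgramRecord:
    """Parse one tab-separated line from a Version 1 or Version 2 ngram file."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) == 5:
        # Version 1: ngram, year, match_count, page_count, volume_count
        ngram, year, match_count, page_count, volume_count = fields
        return NgramRecord(ngram, int(year), int(match_count),
                           int(page_count), int(volume_count))
    if len(fields) == 4:
        # Version 2: ngram, year, match_count, volume_count
        ngram, year, match_count, volume_count = fields
        return NgramRecord(ngram, int(year), int(match_count),
                           None, int(volume_count))
    raise ValueError(f"expected 4 or 5 tab-separated fields, got {len(fields)}")


def match_counts_by_year(lines, target):
    """Collect {year: match_count} for one ngram, compared case-sensitively."""
    counts = {}
    for line in lines:
        record = parse_ngram_line(line)
        if record.ngram == target:
            counts[record.year] = record.match_count
    return counts


# Made-up sample lines in the Version 2 layout; the counts are illustrative only.
SAMPLE_LINES = [
    "example\t2001\t10\t3\n",
    "example\t2002\t25\t7\n",
    "Example\t2002\t4\t2\n",
]

if __name__ == "__main__":
    print(match_counts_by_year(SAMPLE_LINES, "example"))
```

Because the comparison is an exact string match, "example" and "Example" are counted separately, which mirrors the case-sensitive matching described above. The per-year match_count values collected this way are the quantity the section says the viewer plots.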
|